In this paper we demonstrate the feasibility of providing a spelling corrector in most interactive
user interfaces. The issues in using spelling correction in a user interface are examined and a
simple correction algorithm is described. We describe the results of an experiment in which the
corrector is incorporated into a heavily used interactive program. More than one quarter of the
errors made by users during our experiment were corrected using our simple mechanisms. From
this we have concluded that there are considerable benefits and few obstacles to providing a
spelling corrector in almost any interactive user interface.

1. Introduction

The automatic detection and correction of spelling errors in prose has received a considerable amount of
attention  (an  annotated  bibliography  is  given  by  Peterson  [7]).  However,  users  spend  a  considerable  amount 
of  time  typing  commands  to  the  user  interfaces  of  programs,  and  making  similar  typographical  errors  to  those 
made  while  entering  prose.  Although  designing  and  implementing  a  well  human-engineered  and  tolerant 
user  interface  requires  considerable  effort,  it  is  possible  that  some  basic  techniques,  such  as  correcting  the 
spelling  of  keywords,  can  be  applied  at  low  cost.  In  discussing  desirable  attributes  of  good  user  interfaces, 
Hayes, Ball, and Reddy [4] assert that spelling correction in typical interactive programs is straightforward,
since  usually  an  error  is  made  in  a  context  where  only  a  limited  number  of  keywords  (fewer  than  100)  are 
applicable.  If  spelling  correction  is  really  “straightforward”  then  there  is  no  reason  why  almost  all  user 
interfaces  should  not  provide  such  a  facility,  even  to  the  extent  of  retrofitting  a  corrector  into  existing 
programs.  Morgan  [6]  describes  a  spelling  corrector  that  was  implemented  in  both  an  operating  system  and  a 
compiler,  for  which  the  statement  of  purpose  was: 

The goal of the proposed spelling correction techniques is, roughly stated, to achieve a
proficiency comparable to that of a quick scan of the source program by an experienced programmer
who has no knowledge of the program, and who makes no attempt to understand its purpose.

There  seem  to  be  very  few  systems  that  take  advantage  of  such  a  facility.  The  most  well  known  such  facility  is 
DWIM (Do What I Mean) in Interlisp [9], whose statement of purpose is almost the same as Morgan’s.

The  purpose  of  our  investigation  was  to  learn  what  issues  distinguish  spelling  correction  in  user  interface 
applications  from  the  more  general  problem  of  manuscript  spelling  correction  and  to  learn  how  much  of  a 
contribution  such  a  facility  could  make.  We  also  wanted  to  get  a  realistic  picture  of  the  errors  users  really 
make and to supplement the data presented by Damerau in 1964 [1]. To this end we considered the desirable
characteristics  for  a  spelling  corrector  for  an  interactive  program,  and  developed  a  variation  on  Damerau’s 
algorithm  meeting  these  requirements.  We  incorporated  the  spelling  corrector  into  the  command  table 
module  used  by  the  RdMail  electronic  mail  system  [5],  which  is  in  heavy  daily  use  by  a  community  of  several 
hundred researchers. RdMail has a conventional typescript-oriented command language, where most commands
consist of a verb followed by a sequence of arguments. Most user terminals are low-bandwidth “glass
teletypes,” which can display 24 lines of 80 characters. To learn about the issues and to find out what kinds of
mistakes are made by users, we collected data on the accuracy of the keywords entered by users and the ability
of the program to offer corrections when keywords were not recognized immediately. We collected general
information about the use of the command interface and recorded specific information about the keywords
that were not recognized and what corrections were made.

Throughout the paper we will illustrate issues with examples of a user interacting with RdMail. User
type-in is underlined to distinguish it from RdMail output. The symbol “CR” represents the carriage return
key. Italics are our comments and explanations, not part of the typescripts.

2. Design Issues

We were prompted to look for a new correction algorithm because of several differences between correcting
spelling in general manuscripts and correcting spelling in user interface applications. Design considerations
for the spelling corrector fell into two categories: those affecting the design of the algorithm, and those
affecting its use in an interactive system.

2.1.  Algorithm  Design  Issues 

We chose the same set of assumptions about typographical errors as Gorin did for the PDP-10 SPELL
program [3]. We assume that there is exactly one error in the symbol to be corrected and that the error arises
from one of the four causes that account for over 80% of spelling errors [1]:

•  Transposition  of  two  adjacent  letters. 

•  One  letter  wrong. 

•  One  extra  letter. 

•  One  letter  missing. 

These errors are illustrated in Figure 2-1.

(d) Missing Letter in Abbreviation
Figure 2-1: Keyword matching patterns


To minimize the difficulty of modifying programs to use the new corrector, we decided to transparently
replace  a  library  keyword  look-up  routine  with  a  version  that  did  spelling  correction.  The  original  symbol 
table  module  accepts  as  parameters  an  unsorted  vector  of  strings  and  a  single  probe  string  to  match  against  the 
elements  of  the  vector.  The  look-up  algorithm  allows  the  probe  to  be  a  unique  initial  substring  of  a  table 
entry  and  reports  if  the  probe  is  ambiguous. 
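
The look-up behaviour just described can be sketched in a few lines of Python. This is an illustration only, not the SAIL library routine: the names `lookup` and `Status` are hypothetical, and letting an exact match take priority over longer entries that it abbreviates is an assumption.

```python
from enum import Enum


class Status(Enum):
    FOUND = "found"
    AMBIGUOUS = "ambiguous"
    NOT_FOUND = "not found"


def lookup(table, probe):
    """Match `probe` against an unsorted list of keywords.

    The probe may be any initial substring of an entry, provided it
    selects exactly one entry. Case is ignored.
    """
    key = probe.lower()
    # Assumption: an exact match wins even if it also abbreviates other entries.
    exact = [word for word in table if word.lower() == key]
    if exact:
        return Status.FOUND, exact
    hits = [word for word in table if word.lower().startswith(key)]
    if len(hits) == 1:
        return Status.FOUND, hits
    if hits:
        return Status.AMBIGUOUS, hits
    return Status.NOT_FOUND, []
```

With the table `["Answer", "Accept", "Delete", "Debug", "Help"]`, the probe "he" selects "Help", "de" is reported as ambiguous, and "xyz" is not found.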

The specification of the original library module illustrates three differences between spelling correction in
manuscripts  and  spelling  correction  in  user  interfaces: 

•  It  is  common  to  allow  abbreviations  in  a  user  interface,  to  minimize  typing. 

•  The  probe  being  looked  up  in  a  table  may  match  several  entries. 

•  Affix  (suffix  and  prefix)  analysis  is  not  necessary  since  the  legal  symbols  come  from  a  very  limited 
vocabulary. 

2.2.  User  Interaction  Issues 

A  number  of  issues  arise  when  a  user  interface  is  supplemented  with  a  spelling  corrector.  These  issues 
concern  the  interactions  with  the  user  when  spelling  correction  is  attempted  and  can  interfere  with  the  user’s 
ability  to  work  with  the  interface.  These  issues  are  discussed  below. 

If  the  user’s  symbol  contains  only  one  character,  the  “extra  letter”  test  would  omit  this  character  from  the 
dictionary  search  and  would  therefore  match  all  the  words  in  the  dictionary  (or  none  of  them,  depending  on 
the  semantics  of  an  empty  string),  which  does  not  help  the  user.  Similarly,  the  “wrong  letter”  test  would 
match any word beginning with any other character in the symbol alphabet. We chose to report that no
match  had  been  found  in  this  case. 

If  the  user’s  symbol  contains  only  two  characters,  the  various  tests  for  diagnosing  errors  may  still  produce  a 
substantial  number  of  possible  matches.  The  “transposition”  test  and  “missing  letter”  test  are  reasonable  and 
behave  the  same  for  two  character  symbols  as  for  longer  symbols.  However,  the  “extra  letter”  and  “wrong 
letter”  tests  can  produce  a  large  number  of  possible  matches.  Suppose  the  user’s  symbol  is  “xy”  then  the 
“extra letter” test would match all symbols beginning with either “x” or “y”. The “wrong letter” test
would match all words beginning with “x?” and “?y”, where “?” matches any character. The designer must
decide whether the size of the set of possible matches in these cases is sufficiently small to permit the user to
choose one of them or whether to behave as though no match had been found. We initially chose to omit
both the “extra letter” and “wrong letter” tests for two character symbols. After a few months of operation we
included the “wrong letter” test and received many complaints that the spelling corrector offered too many
choices, most of which were quite unexpected; this supports the original decision to omit the test.

Suppose  that  the  correction  algorithm  finds  exactly  one  matching  symbol.  Is  it  safe  to  assume  that  the 
correction  is  accurate?  In  general,  the  answer  is  no,  because  the  user  may  have  made  an  error  (or  multiple 
errors)  not  detected  by  the  four  tests.  An  example  of  this  is  to  omit  the  space  between  two  keywords.  The 
designer’s  decision  must  be  based  on  the  consequences  of  using  the  symbol  in  error.  The  following  example  is 
quite  harmless: 




SPELLING  CORRECTION  IN  USER  INTERFACES 


<-hlepCR

% I assume you mean 'Help' instead of 'hlep'.

Help  text  is  output  here. 

<- 

However,  the  consequences  of  assuming  the  accuracy  of  a  particular  correction  may  be  much  more  serious,  as 
we  demonstrate  in  the  following  contrived  example: 

<-overwiteCR

% I assume you mean 'Overwrite' instead of 'overwite'.

Program  proceeds  to  expunge  deleted  messages. 

<- 

The unfortunate user did not mean “Overwrite”; he actually meant

<-dover witeCR

Program sends file “WITE” to the Dover xerographic printer.

<- 

In  RdMail  this  problem  is  avoided  by  requiring  confirmation  before  some  irreversible  action  is  taken,  even  if 
the  user  did  not  make  a  spelling  error;  these  mechanisms  are  entirely  outside  the  spelling  corrector. 


Suppose  next  that  the  engineering  decision  is  to  require  confirmation  of  all  spelling  corrections.  In  a 
human-engineered  system,  the  actions  taken  by  the  system  should  require  little  effort  by  the  user.  For 
example  in  offering  particular  default  answers  to  questions,  the  common  response  to  accept  the  default  is 
simply  to  hit  one  key,  carriage  return: 

<-list allCR

onto file? [LPT:MAIL]: CR

Program  lists  all  messages  onto  the  line  printer. 

<- 

In  the  previous  example  we  see  that  the  user  can  still  suffer  from  prior  training  to  hit  “CR”  in  response  to 
defaults  offered  by  the  system: 

<-overwiteCR

% Do you mean 'Overwrite' instead of 'overwite'? [Yes]: CR

Program  proceeds  to  expunge  deleted  messages. 

<- 


Clearly the default response when the user typed “hlep” instead of “Help” could safely have been “Yes”,
while in the “Overwrite” example it would have been dangerous. It is important that the default responses for
spelling  correction  be  consistent  to  prevent  serious  mistakes.  Although  it  might  be  more  frustrating  for  the 
user  to  have  to  say  “yCR”  to  accept  “Help”  for  “hlep”,  it  is  certainly  better  than  losing  information  in  the 
“Overwrite”  example.  Hence,  the  safe  version  of  the  previous  example  is: 

<-overwiteCR

% Do you mean 'Overwrite' instead of 'overwite'? [No]: CR
No damage this time.
<-dover witeCR
System prints file “WITE”.

<- 


A  reasonable  compromise  may  be  to  assume  the  accuracy  of  corrections  for  which  the  consequences  are 
reversible  (flagging  such  dictionary  entries,  for  example)  and  request  confirmation  of  those  for  which  the 
consequences are not reversible. A simple variant of this last option is used in the current version of RdMail:
Confirmation  is  required  for  corrections  made  in  some  dictionaries  but  not  others.  For  example,  main 
command  corrections  are  assumed  to  be  accurate  since  all  actions  are  further  confirmed  or  are  reversible. 
However,  confirmation  is  required  when  an  error  is  made  in  naming  a  program  for  RdMail  to  run  as  a 
sub-job.  Assuming  the  wrong  program  name  could  have  irreversible  consequences,  such  as  deleting  files.  A 
RdMail  user  may  set  an  option  to  always  request  confirmation  of  spelling  corrections. 
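
The per-dictionary policy can be made concrete with a small Python sketch. It is not RdMail's code: the `Dictionary` record, its `confirm_corrections` flag and the function names are hypothetical, and the default answer of "No" follows the safe version of the "Overwrite" example above.

```python
from dataclasses import dataclass


@dataclass
class Dictionary:
    name: str
    confirm_corrections: bool  # True if a wrong guess could be irreversible


def needs_confirmation(dictionary, always_confirm=False):
    """Ask the user only for risky dictionaries, or when the user opted in."""
    return always_confirm or dictionary.confirm_corrections


def accept_correction(dictionary, reply, always_confirm=False):
    """Decide whether to apply a proposed correction.

    `reply` is what the user typed before CR. An empty reply takes the
    default, which is "No" whenever confirmation is required.
    """
    if not needs_confirmation(dictionary, always_confirm):
        return True
    return reply.strip().lower() in ("y", "yes")
```

Under this sketch a bare carriage return accepts a main-command correction but rejects a corrected program name, so habitual use of CR cannot start the wrong sub-job.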

When the correction algorithm finds more than one matching symbol in the symbol table, the designer
must decide whether or not the user should be given the opportunity to select the correct symbol from the
smaller set of matching symbols. He must also decide whether to invest the effort in further reducing the size
of the set of matching symbols by using heuristic factors, such as the relative positions of characters on the
keyboard.  We  included  no  such  heuristics  and  simply  offered  the  user  all  matching  symbols.  The  data 
collected  during  our  experiment  showed  that  between  two  and  ten  alternative  corrections  were  offered,  with 
the  majority  of  cases  producing  between  two  and  four  alternatives. 

% 'ecx' could be any one of the following:

Echo, Exit

Which one do you mean? [None of the above]: CR

<- 


An  alternative  strategy  for  handling  common  ambiguities  is  to  provide  preferred  disambiguations.  For 
example, in RdMail “A” is presumed to mean “Answer” instead of “Accept,” “Alias,” or “Allocate.” This is
handled by a mechanism outside the spelling corrector: the single character command “A” is added to the
command table as a synonym for “Answer.”

Finally,  what  should  be  done  if  there  is  a  problem  with  the  symbol  supplied  by  the  user  in  response  to  the 
question,  “Which  one  do  you  mean?”  Some  of  our  users  suggested  that  spelling  correction  should  be  applied 
recursively,  but  others  wanted  to  be  able  to  type  in  the  name  of  the  command  they  had  meant  initially.  For 
example, when a user typed “de” instead of “ed” for “Edit”, the program offered as alternatives only those
commands for which “de” is an ambiguous abbreviation; the user would probably prefer to respond with
“Edit” even though the symbol is not in the set offered by the program. Clearly a combination of the two
could be applied. For example, apply spelling correction first, but if that still doesn't produce an unambiguous
symbol, look for the new symbol in the original symbol table rather than in the set of possible
corrections. On the other hand this would require the user to maintain a complex model of what the corrector
is doing. For our experiment, we chose the simple expedient of forcing the user to get the symbol right rather
than making any attempt to correct the correction. The program simply repeats the question:

<-de 201CR

% 'de' could be any one of the following:

DeAllocate, Debug, Declassify, Delete

Which one do you mean? [None of the above]: dleCR

% 'dle' is not an option.

Which  one  do  you  mean?  [None  of  the  above]:  delCR 
Program  deletes  message  201. 
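
The repeat-the-question behaviour shown in this typescript can be sketched as a small Python loop. The function name `choose` and its interface are hypothetical. We also assume, from the "del" reply above, that a reply may be any initial substring that selects exactly one of the offered symbols, and that an empty reply takes the default "None of the above".

```python
def choose(options, replies):
    """Return the option selected by the first acceptable reply.

    `replies` is any iterable of the strings the user types (CR removed).
    No spelling correction is applied to a reply: an unacceptable reply
    simply causes the question to be asked again.
    """
    for reply in replies:
        key = reply.strip().lower()
        if not key:
            return None  # default answer: none of the above
        hits = [opt for opt in options if opt.lower().startswith(key)]
        if len(hits) == 1:
            return hits[0]
        # Not an option (or still ambiguous): repeat the question.
    return None
```

Replaying the typescript, `choose(["DeAllocate", "Debug", "Declassify", "Delete"], ["dle", "del"])` rejects "dle" and then returns "Delete".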


The  action  taken  at  a  user  interface  when  all  attempts  fail  to  produce  a  unique  symbol  is  not  specific  to 
those  interfaces  that  use  spelling  correction.  However,  a  couple  of  simple  actions  should  be  mentioned.  First, 
the  command  containing  the  erroneous  symbol  may  be  aborted.  Having  tried  our  best  to  make  sense  of  the 
command,  we  must  ultimately  give  up  since  it  will  be  much  easier  for  the  user  to  express  himself  more 
accurately.  This  approach  was  used  in  RdMail. 

<-Aaaarrgghh!CR

?  No  such  command  as  Aaaarrgghh.  Type  ?  for  help. 

<- 

A slightly more sophisticated approach is to ask the user to correct the symbol and then attempt to continue
with the command (notice that two errors are detected in the command):

<-haeders from Robertson intersect week "May 16"CR
% I assume you mean 'Headers' instead of 'haeders'.

? 'week' is not a Message Sequence keyword.

Message Sequence keyword [Abort command]: sinceCR
Program  lists  headers  of  messages  from  Robertson  since  May  16. 

<- 

This mechanism was added to RdMail after our experiment.


3.  The  Correction  Algorithm 

The  comparison  of  two  symbols  (the  user’s  and  a  dictionary  symbol)  is  done  in  three  parts,  as  illustrated  by 
the  three  divisions  of  each  example  in  Figure  2-1: 

1.  Find  the  common  initial  sub-string  (i.e.  up  to  the  first  difference).  Case  distinctions  in  letters  may 
be  ignored:  the  algorithm  must  find  all  possible  matches  for  the  symbol  not  found  by  the  initial 
search. 


2.  Examine  the  next  pair  of  letters  for  a  transposition  error. 

3. Match the tail substrings. If the previous step suggested that two characters had been transposed,
omit two characters from both symbols and match the remaining substrings. For the “wrong
letter” test omit one character from each symbol and match the tail substrings. For the “extra
letter” (“missing letter”) test, omit one letter from the user’s (dictionary) symbol and match the
remaining tail substrings.


The matching steps are repeated for each symbol in the dictionary. Each dictionary symbol that is matched
is added to a set of possible corrections for the user’s symbol. If there is only one member in the set when all
of the symbols in the dictionary have been examined, that symbol may be offered as the correction. If there
are  several  symbols  in  the  set,  the  user  may  be  asked  to  select  one. 


We  originally  tried  the  SPELL  program’s  strategy  of  searching  the  symbol  table  for  each  string  that  could 
be transformed into the user’s symbol by one of the four kinds of errors. This was far too slow with the
original  library  look-up  algorithm. 

Using  only  the  lengths  of  the  user’s  symbol  and  the  dictionary  symbol,  two  optimizations  can  be  made  to 
avoid  unnecessary  string  comparisons: 

1. If the length of the user’s symbol exceeds the length of the dictionary symbol by more than one
character,  no  match  is  possible  with  the  above  algorithm,  so  the  “no  match”  result  can  be  returned 
immediately.  (This  assumes  that  the  string  length  is  readily  available.) 

2.  If  the  first  difference  found  is  in  the  last  character  of  the  user’s  symbol,  the  “extra  letter”  test 
would discard the character and therefore convert the user’s symbol into a matching initial substring
of the dictionary symbol.

This algorithm requires no intermediate string construction. The only additional storage required is for the
set of matched dictionary entries, which can be represented as a vector of booleans with one element per
dictionary element. At the end of the algorithm the set of matching symbols is identified by all true elements
in the set vector. The algorithm also has the advantage of being trivial to implement. The most expensive
components are the low-level functions that locate the first difference and match substrings. In our experimental
implementation both of these functions were written in assembly code using straightforward
character-by-character comparisons.

An example implementation of this algorithm in Ada¹ is given in Appendix I. This implementation
exploits  Ada’s  facility  for  dynamically  dimensioned  arrays.  In  languages  which  lack  this  facility,  other  data 
structures,  such  as  linked  lists  or  large  fixed-size  arrays,  can  be  used  instead. 


¹ Ada [2] is a registered trademark of the U.S. Department of Defense (OUSDRE-AJPO).

3.1. Implementation of the Corrector

The  main  spelling  correction  algorithm  was  implemented  in  SAIL  [8],  an  Algol-based  language  that 
provides  strings  as  a  primitive  data-type.  There  is  an  extensive  library  of  SAIL  functions  at  C-MU  that 
includes  a  command-table  abstraction  and  a  more  primitive  symbol-table  abstraction.  It  is  in  the  command 
table  module  that  the  spelling  corrector  was  applied  depending  on  the  result  of  the  table  search  by  the 
symbol-table  function.  The  corrector  was  applied  only  when  the  keyword  was  not  found  in  the  symbol  table. 

To  handle  multiple  matching  symbols,  the  corrector  builds  a  table  (vector)  of  string  pointers  and  calls  a 
subroutine  which  asks  the  user  to  select  the  correct  keyword.  The  subroutine  forces  the  user  to  be  accurate  in 
selecting  one  keyword  from  the  set  offered.  In  particular,  no  attempt  is  made  to  correct  the  selected  keyword. 

3.2.  Performance 

The following informal analysis shows that the spelling correction algorithm is quite adequate for our
requirements even though it clearly is not optimal for the general correction application in prose. What it
lacks  in  performance  is  returned  in  simplicity  that  facilitates  its  introduction  in  a  wide  variety  of  applications. 

To  correct  one  symbol  given  a  dictionary  of  N  symbols,  our  algorithm  performs  N  initial  substring  matches 
and  at  most  4N  tail  substring  matches.  (The  transposition  tail  match  is  performed  only  if  the  “transposition” 
test  succeeds.)  At  worst,  this  is  equivalent  to  4N  equality  string  matches,  plus  a  small  constant  overhead  per 
dictionary  element. 

To  get  a  more  concrete  measure  of  the  cost  of  spelling  correction,  we  performed  some  measurements  of  the 
algorithm’s running time. Measurements were taken on a lightly loaded DECsystem-10² KL-10 processor
(the same one used in the RdMail experiment described in Section 4). In each case the data were obtained by
running 5000 tests in a loop, subtracting the original value of the system clock from the final value, subtracting
loop overhead and dividing by 5000. Look-ups were done on a table of 66 entries, a subset of the main
command table from the RdMail program. The original command table had 77 entries; we eliminated five
punctuation character commands and six commands where transposing the first two letters results in an
ambiguity.  Measurements  were  taken  of 

a.  The  time  to  look  up  a  correct  entry  with  the  original  library  module  (without  spelling  correction) 
and  with  the  new  module  (with  spelling  correction). 

b.  The  time  taken  to  decide  a  probe  is  not  in  the  table  with  each  of  the  two  modules.  Five  sources  of 
failing  keys  were  chosen: 

² DECsystem-10 is a trademark of Digital Equipment Corporation.

1. Character strings of the form “aaaaa”, “bbbbb”, and so on.

2.  Failing  keywords  collected  during  the  experiment  described  in  Section  4.  All  keys  longer 
than  two  characters  were  included. 

3.  The  numeric  keys  from  the  experiment. 

4. The alphabetic keys from the experiment.

5. Strings of the form “KEYxxx”, where KEY is a command from the main table.

These times do not include the time taken to print an error message; printing times are reported
separately. We also report the difference between the new and the old look-up times, which
represents the time taken by the spelling corrector, and this difference divided by the table size,
which roughly represents the cost per table entry. The last number varies because of the optimizations
mentioned in Section 3, which can reject some symbols quickly.

c.  The  time  taken  to  handle  a  transposition  of  the  first  two  characters  of  a  command.  A  command 
was  chosen  and  its  first  two  letters  were  transposed  before  calling  the  routine.  For  the  original 
module this results in a failing look-up; the new module corrects this error.

d.  The  time  taken  to  print  a  message  of  the  form  “I  assume  you  mean  X  instead  of  Y.” 

e.  The  time  taken  to  print  a  message  of  the  form  “X  is  not  a  command.” 
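
The timing procedure described at the start of this section (run the operation many times, subtract the cost of the empty loop, divide by the number of runs) might be sketched in Python as follows. This is an illustration under modern assumptions, not the original harness: the function name is hypothetical and `time.perf_counter` stands in for the system clock.

```python
import time


def mean_call_time_ms(func, arg, runs=5000):
    """Average time of func(arg) in milliseconds, net of loop overhead."""
    clock = time.perf_counter

    start = clock()
    for _ in range(runs):
        func(arg)
    with_calls = clock() - start

    # Time the same loop with an empty body to estimate loop overhead.
    start = clock()
    for _ in range(runs):
        pass
    overhead = clock() - start

    return (with_calls - overhead) / runs * 1000.0
```

On a noisy machine the result for a very cheap operation can come out slightly negative, which is one reason to take measurements on a lightly loaded processor.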

The results are illustrated in Figure 3-1.


                                 Time (milliseconds)

Test                   Original       New    Difference   Per Table
                        look-up   look-up                     Entry

Succeeding                1.781     1.797

Failing
  “aaaaa”, etc.           3.100    24.728        21.628        .328
  real data               3.123    22.060        18.937        .287
  real numeric            3.162    25.613        22.451        .340
  real alphabetic         3.122    21.943        18.821        .285
  KEYxxx                  2.990    12.385         9.395        .142

Transposition
  failing                 3.221
  correcting error                 28.384

Print “I assume...” message:           5.749
Print “... not a command” message:     4.594

Figure 3-1: Measurements of the Corrector
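
The two derived columns of Figure 3-1 can be reproduced from the measured look-up times and the table size of 66 entries. The short Python check below is only a worked illustration of that arithmetic; the names `FAILING` and `derived_columns` are ours.

```python
TABLE_SIZE = 66  # entries in the reduced command table

# (original look-up ms, new look-up ms) for the failing probes of Figure 3-1
FAILING = {
    "aaaaa, etc.": (3.100, 24.728),
    "real data": (3.123, 22.060),
    "real numeric": (3.162, 25.613),
    "real alphabetic": (3.122, 21.943),
    "KEYxxx": (2.990, 12.385),
}


def derived_columns(original_ms, new_ms, table_size=TABLE_SIZE):
    """Return (difference, difference per table entry), rounded to 3 places."""
    difference = new_ms - original_ms
    return round(difference, 3), round(difference / table_size, 3)


for label, (old, new) in FAILING.items():
    diff, per_entry = derived_columns(old, new)
    print(f"{label:<16} {diff:7.3f} {per_entry:6.3f}")
```

For the first row this gives 24.728 - 3.100 = 21.628 ms, and 21.628 / 66 is about 0.328 ms per table entry, in agreement with the figure.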


We  conclude  that  the  algorithm  described  above  is  a  good  choice  for  those  applications,  such  as  user 
interfaces, in which the size of the dictionary is quite small and abbreviations must be handled. For the
library symbol table module used in the experiment we cannot do much better since the specification for the
lookup routine does not require that the table be sorted.

4. The Experiment

The  principal  purpose  of  the  informal  experiment  described  in  this  section  was  to  learn  how  useful  a 
spelling  corrector  might  be  in  an  interactive  user  interface.  In  asking  about  the  errors  made  by  users,  our 
attention  is  confined  to  finding  symbols  in  the  various  symbol  tables  used  by  the  program;  we  explicitly 
exclude semantic and syntactic errors in composing commands except as they are detected by failing to find a
symbol in a particular table. The efficacy of the correction facility depends on the variety of errors users make
and  their  respective  frequencies.  The  rate  of  errors  made  determines  the  overall  cost  of  the  spelling  correction 
facility.  We  need  to  determine  what  proportion  of  those  errors  can  be  attributed  to  typographical  errors  that 
may  be  recovered  by  the  corrector.  Learning  what  other  errors  are  made  may  suggest  other  ways  to  improve 
the  tolerance  of  the  user  interface  to  user  errors. 

The  RdMail  Message  Management  System  [5]  was  used  for  the  experiment.  We  describe  the  program  and 
the information we recorded from it next. Then we present the results of the experiment and offer a brief
evaluation of the effectiveness of our mechanisms in the light of those results.

4.1. The RdMail System

RdMail  is  an  interactive  electronic-message  management  system  that  provides  facilities  for  the  composition 
and  transmission  of  messages  among  users  of  a  network  of  computers.  Messages  received  can  be  classified, 
answered,  and  filed  conveniently.  RdMail  commands  are  sequences  of  keywords  and  parameters,  where  the 
parameters are usually numbers or quoted strings. Users may operate on sets of messages by specifying the
particular messages by number, attribute (such as date of arrival, name of originator, or subject), or
user-defined classification. For example,

<-headers meetngs intersect (since "Jan 1" intersect 50:175)CR

% Do you mean 'Meetings' instead of 'meetngs'? [No]:

produces a brief identification of all messages between 50 and 175 that arrived since January 1 and were
classified as “Meetings”. Since the user defines the names of classifications, such as “Meetings”, the set of
symbols in the table used in the parsing of message set specifications is dynamic.

The  user  leaves  RdMail  either  temporarily  to  use  an  editor  to  alter  the  composition  of  a  message,  or 
permanently  to  return  to  the  system's  executive  program.  For  the  purposes  of  the  experiment,  a  session 
encompasses  only  those  commands  that  are  given  before  the  user  leaves  RdMail  for  any  reason.  Hence  one 
user  “session”  in  which  a  message  arrives,  an  answer  is  composed  then  edited  before  being  mailed,  and  finally 
new  mail  is  sent  to  other  people  would  be  considered  two  sessions  in  the  collected  data:  the  first  before 
entering  the  editor;  the  second  after  returning  from  the  editor  and  before  leaving  the  program  permanently. 

For  the  duration  of  the  experiment,  RdMail  forced  users  to  confirm  all  corrections  suggested  by  the 
program.  This  was  the  only  alteration  in  the  RdMail  user  interface  specifically  for  this  experiment. 

4.2. The Data Collected

Because  of  the  sensitive  position  of  RdMail  in  communicating  between  users  we  recorded  no  data  that 
could  be  traced  directly  to  particular  users.  This  anonymity  was  also  important  to  reduce  the  probability  that 
users  would  become  self-conscious  about  making  typographical  errors  and  take  more  care  than  usual.  We  felt 
morally  obliged  to  warn  people  that  we  were  performing  an  experiment,  and  to  give  them  the  option  of 
running  a  different  program  to  avoid  participating  in  the  experiment.  Warning  users  that  the  experiment  was 
to  be  conducted  encouraged  a  few  to  entertain  us  with  some  colorful,  if  illegal,  keywords.  In  some  cases  it  was 
apparent  that  users  were  probing  the  limits  of  RdMail’s  correction  facility.  However,  we  cannot  be  sure  of 
any  particular  user’s  intention  and  have  therefore  included  the  apparently  intentional  errors  in  our  results. 

The  data  recorded  for  each  RdMail  session  were: 

•  The  number  of  commands  given  to  RdMail  (both  from  the  keyboard  and  from  pre-existing  files). 

•  The  number  of  keywords  for  which  RdMail  searched  symbol  tables  and  the  number  of  those 
keywords  that  were  not  found  or  were  ambiguous. 

For  each  symbol  that  was  not  uniquely  matched  in  a  particular  symbol  table,  a  detailed  record  was  made 
including: 

•  The  symbol  the  user  provided. 

•  The  correct  symbol,  if  any,  as  confirmed  by  the  user. 

•  The  number  of  possible  corrections  for  the  symbol  identified  by  the  spelling  corrector. 

•  The  identity  of  the  symbol  table. 

•  The  approximate  execution  time  taken  to  identify  the  corrections  that  could  be  made  (rounded  to 
the  nearest  millisecond). 
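The two lists above amount to a pair of record types. A minimal sketch of how such records might be laid out follows; the class and field names are hypothetical (RdMail's actual logging format is not given here), and splitting the error count into "not found" and "ambiguous" is an assumption based on the way the results are later reported. Note that no field identifies the user.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class UnmatchedSymbol:
    """Detail record for one symbol that was not uniquely matched (hypothetical layout)."""
    typed: str                # the symbol the user provided
    confirmed: Optional[str]  # the correct symbol, if the user confirmed one
    candidate_count: int      # number of possible corrections identified
    table_id: str             # identity of the symbol table searched
    lookup_ms: int            # approximate corrector time, nearest millisecond


@dataclass
class SessionRecord:
    """Per-session counters; deliberately carries nothing that identifies the user."""
    commands_keyboard: int
    commands_file: int
    keywords_looked_up: int
    keywords_not_found: int
    keywords_ambiguous: int
    details: List[UnmatchedSymbol] = field(default_factory=list)


# Example: one session with a single misspelled keyword.
session = SessionRecord(3, 0, 9, 1, 0)
session.details.append(UnmatchedSymbol("hedaers", "Headers", 1, "main-commands", 10))
```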

Gathering  additional  data,  such  as  the  entire  command  line  containing  each  unrecognized  symbol,  would 
have  aided  us  in  determining  the  causes  of  uncorrected  errors.  We  decided  not  to  do  this  on  account  of  the 
need to respect privacy. Consider a command line such as:

    <-headers from Bovik intersect subelct "pay raise"<CR>

We shouldn't learn that Bovik is up for a pay raise just because someone misspelled “subject”.

The reason we chose RdMail as our experimental vehicle in spite of this inconvenience is that, with the
possible exception of various operating systems, RdMail has by far the most heavily used “command line”
style interface in our environment. The other heavily used programs are either compilers, which are not
interactive, or text editors, which use mostly single-character commands.

4.3. Usage Statistics

The experiment ran for 41 days, during which time a total of 23,361 RdMail sessions were recorded.
RdMail  processed  a  total  of  145,972  commands  during  the  experiment;  140,038  from  terminals  and  5,934 
from  command  files.  Data  from  batch  jobs  were  discarded  because  we  were  interested  only  in  human 
typographical  errors,  not  general  RdMail  use.  RdMail  handled  a  total  of  455,811  keywords  during  the 
experiment,  averaging  three  per  command.  The  distribution  of  numbers  of  keywords  in  sessions  is  shown  in 
Figure  4-1. 


Figure 4-1: Keywords in a session

The running time for identifying the set of possible corrections varied considerably, ranging up to 31
milliseconds with an average of 9.5 ms, but with a relatively large standard deviation, σ = 9.5 ms.³ The total
time used by the spelling corrector over the 41 days of the experiment was 19.2 seconds, an average of 468
milliseconds per day.


³ The average time spent in the spelling corrector for the subset of the data used as “real keys” in Section 3.2 was 19.7 ms. This agrees
well with the spelling corrector cost (19.1 ms) shown in Figure 3-1. Many erroneous keys were processed more quickly because they were
shorter than three characters or because they were looked up in small tables.




4.4.  Results 

During the 41 days of our experiment, RdMail encountered 2527 erroneous (i.e., not uniquely
identifiable) keys (0.554% of all symbols entered). Due to an oversight in the data-collection routines, we
cannot determine how many keys came from command files. Four percent of commands came from command
files. Even if the number of keywords per file command were an order of magnitude greater than the
average number of keywords for all commands, the error rate for manually entered keys would be only
0.934%, which is still very small. Actual error rates may be higher, since we cannot tell how often a user
noticed an error and corrected it manually (by backspacing over the error or deleting the input line, and
re-typing) before hitting carriage return to enter the command. The erroneous keywords were recorded in
two different classes: keywords not found in the symbol table and keywords that were ambiguous. The
distribution of these errors is shown in Figure 4-2.
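The two error rates quoted above can be reproduced from the usage statistics of Section 4.3. The sketch below is only a worked check, and the function names are illustrative. It makes the pessimistic assumption that every erroneous key was typed by hand, and it inflates the keywords-per-command figure for command files by a factor of ten.

```python
TOTAL_KEYWORDS = 455_811   # keywords handled during the experiment
TOTAL_COMMANDS = 145_972   # commands processed
FILE_COMMANDS = 5_934      # commands that came from command files
ERRONEOUS_KEYS = 2_527     # keys not uniquely identifiable


def overall_error_rate() -> float:
    """Erroneous keys as a fraction of all keywords."""
    return ERRONEOUS_KEYS / TOTAL_KEYWORDS


def worst_case_manual_error_rate(file_keyword_factor: float = 10.0) -> float:
    """Assume file commands carry `file_keyword_factor` times the average number
    of keywords, and that all erroneous keys were entered manually."""
    average_per_command = TOTAL_KEYWORDS / TOTAL_COMMANDS
    file_keywords = FILE_COMMANDS * average_per_command * file_keyword_factor
    manual_keywords = TOTAL_KEYWORDS - file_keywords
    return ERRONEOUS_KEYS / manual_keywords


print(f"{overall_error_rate():.3%}")            # 0.554%
print(f"{worst_case_manual_error_rate():.3%}")  # 0.934%
```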


    Unmatched Keys    Number of        Ambiguous Keys    Number of
    Per Session       Sessions         Per Session       Sessions

          0            21800                 0            22905
          1             1293                 1              423
          2              181                 2               28
          3               44                 3                3
          4               27                 4                2
          5                7
          6                2
          7                3
          8                0
          9                2
         10                1
         40                1

    Total unmatched keys   2031        Total ambiguous keys   496

Figure 4-2: Keywords in Error
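The histogram in Figure 4-2 can be cross-checked against its own totals and against the session count. The short sketch below does so; the dictionary and function names are illustrative only.

```python
# Keys per session -> number of sessions, transcribed from Figure 4-2.
unmatched = {0: 21800, 1: 1293, 2: 181, 3: 44, 4: 27, 5: 7,
             6: 2, 7: 3, 8: 0, 9: 2, 10: 1, 40: 1}
ambiguous = {0: 22905, 1: 423, 2: 28, 3: 3, 4: 2}


def session_count(histogram: dict) -> int:
    """Total number of sessions covered by a histogram."""
    return sum(histogram.values())


def key_count(histogram: dict) -> int:
    """Total number of erroneous keys implied by a histogram."""
    return sum(keys * sessions for keys, sessions in histogram.items())


print(session_count(unmatched), session_count(ambiguous))  # 23361 23361
print(key_count(unmatched), key_count(ambiguous))          # 2031 496
print(key_count(unmatched) + key_count(ambiguous))         # 2527
```

Both columns account for the same number of sessions, and the two key totals add up to the 2527 erroneous keys reported in the text.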


By  examining  the  data  collected  for  each  erroneous  key,  we  arrived  at  the  taxonomy  of  errors  shown  in 
Figure  4-3.  The  percentage  figure  in  parentheses  after  each  class  of  error  gives  the  size  of  that  class  in  relation 
to  the  entire  class  of  2527  recorded  instances  of  erroneous  keys. 

Corrected errors (27%): Transposition error corrected, Missing letter restored, Wrong letter corrected, and
Extra letter removed (16%); Ambiguity resolved (11%). The error recovery mechanism offered potential
corrections (24%) or disambiguations (20%) for 44% of all erroneous keys. However, users did not always
accept corrections and disambiguations when they were offered. Only 56% of the ambiguities detected were
resolved by the user accepting one of the alternatives offered, and users accepted spelling corrections in only
66% of the cases where one or more potential corrections were offered. The errors thus resolved accounted
for 27% of all erroneous keys. For 13% of all erroneous keys, the corrector offered a single correction that was
accepted by the user.

• All erroneous keys (100%)
    o Corrected/disambiguated keys (27%)
        • Ambiguity resolved (11%)
        • Typographical (16%)
            o Transposition error corrected (2.7%)
            o Missing letter restored (4.8%)
            o Wrong letter corrected (4.5%)
            o Extra letter removed (4.0%)
    o Uncorrected keys (73%)
        • Alphabetic (46.6%)
            o One character (10.4%)
            o Two character (9.9%)
            o Three or more characters (26.3%)
                • Typographical? (2.9%)
                    o Missing space?
                    o Missing carriage return?
                    o Missing slash?
                    o Control key?
                    o Typeahead?
                    o Multiple typo?
                    o Miscellaneous typo?
                • Non-typographical? (23.4%)
                    o Intentional error? (3.2%)
                    o Good correction or disambiguation rejected? (0.4%)
                    o Syntax or vocabulary error? (19.8%)
        • Non-alphabetic (26.4%)
            o Control character (12.9%)
            o Number (4.6%)
            o Punctuation (8.9%)

Figure 4-3: Taxonomy of Errors

Speculative classifications are marked with “?”.


In some cases, accurate disambiguations or corrections may have been rejected accidentally. We are unable
to say how often this happened because the privacy constraints on our experiment prevented us from recording
sufficient information to determine which rejected disambiguations and corrections were in fact accurate.

Errors with two or more explanations were assigned to the first of the above categories into which they fit.
For example, if the erroneous key “dle” was corrected to “Deleted”, we accounted for the error as the
transposition of “l” and “e”, rather than as the omission of an “e” or the inclusion of a spurious “l”.
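The tie-breaking rule can be illustrated with a small sketch. It is not the classification procedure used in the experiment, only one plausible reading of it: the function names are hypothetical, the categories are tried in the order in which they are listed in the taxonomy, and the typed key is allowed to be an abbreviation (a prefix) of the keyword once the error is repaired.

```python
from typing import List, Optional


def explanations(typed: str, keyword: str) -> List[str]:
    """Return every single-error explanation that turns `typed` into a prefix of `keyword`."""
    u, t = typed.lower(), keyword.lower()
    i = 0
    while i < min(len(u), len(t)) and u[i] == t[i]:
        i += 1  # i is the first position at which the two strings differ
    found = []
    if (len(u) >= i + 2 and len(t) >= i + 2
            and u[i] == t[i + 1] and u[i + 1] == t[i]
            and t[i + 2:].startswith(u[i + 2:])):
        found.append("transposition")
    if i < len(u) and t[i + 1:].startswith(u[i:]):
        found.append("missing letter")
    if i < len(u) and i < len(t) and t[i + 1:].startswith(u[i + 1:]):
        found.append("wrong letter")
    if i < len(u) and t[i:].startswith(u[i + 1:]):
        found.append("extra letter")
    return found


def classify(typed: str, keyword: str) -> Optional[str]:
    """Assign the error to the first category into which it fits, if any."""
    candidates = explanations(typed, keyword)
    return candidates[0] if candidates else None


print(explanations("dle", "Deleted"))  # ['transposition', 'missing letter', 'extra letter']
print(classify("dle", "Deleted"))      # transposition
```

For the example in the text, all three readings are available, and the first one (transposition) is the one recorded.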

Uncorrected alphabetic keys (73%): Of the 1845 uncorrected erroneous keys, 1179 (46.7% of all erroneous
keys) were “alphabetic”, that is, consisted of a letter followed by zero or more letters or digits. Of these, 264
(10.4% of all erroneous keys) were single letters, 249 (9.9%) were only two characters long, and 666 (26.4%)
were three or more characters long. We manually classified the 666 “multicharacter” (>2-character)
uncorrected alphabetic keys. Since we had to rely on educated guesswork for this classification, it is possible that
we incorrectly classified some of the keys. We have indicated this possibility by placing question marks by the
names of the manually-generated subclasses.

Typographical errors (2.9%): Missing space, Missing carriage return, Missing slash, Control key,
Typeahead, Multiple typo, Miscellaneous typo. We attributed 74 of the 666 erroneous multi-character
alphabetic keys (2.9% of all erroneous keys) to typographical errors of sorts not corrected by our algorithm.
Perhaps the most obvious sort of error in this category is the omission of a space between two keywords (e.g.,
typing “numnew” instead of “num new” to ask for the message numbers of all new messages, or “hdel”
instead of “h del” to ask for the headers of all deleted messages). A similar kind of error, but one whose
existence we might not have guessed without seeing some examples, is the missing carriage return. An
example is the key “exitbb”, almost surely typed by a user who intended to type an “Exit” command to leave
RdMail and then type “BB” to the operating system to read an electronic bulletin board. In one case we
diagnosed an erroneous key as resulting from a missing “/”. On our system, the control character CTRL-S is
used to suspend output to the terminal, an action that is useful to prevent long messages from scrolling off the
screen faster than the user can read; typing a CTRL-Q causes output to resume. If the CTRL key on a terminal
is broken, or if the user doesn’t have his finger on it (some of our terminals have keyboards with REPEAT in
the same position where others have CTRL), the result may be an erroneous key such as “ssssty” (instead of
“ty” to type a message). Another feature of our system is that terminals run in full duplex mode, allowing the
user to enter additional commands while waiting for the machine to respond to earlier commands. Since such
“typeahead” may not be echoed immediately, or may be echoed in the middle of a lot of output, it is possible
for a user to forget how far ahead he has typed. An example of an erroneous key that is probably due to this
phenomenon is “typetype”. Presumably the user keyed in the command “Type” while waiting for the
previous command to finish, then forgot that he had done so and keyed it in again. We attributed eight
erroneous keys to multiple typos. Two examples are “ancr” (instead of “Answer” to reply to a message) and
“hbok” (instead of “h book” to type the headers of all messages in the user-defined message class “book”).
Finally, there were several erroneous keys which appeared to result from problems with the mechanics of
keying in commands but for which we could not confidently specify a most probable cause. It is interesting to
note that 353 of the corrected keys (excluding disambiguated keys) were three or more characters long.
Assuming that our count of 74 typographical errors among the uncorrected multicharacter alphabetic keys is
accurate, this means that 83% (353 out of 353+74 = 427) of all typographical errors resulting in multicharacter
alphabetic erroneous keys were in the four classes handled by the corrector. This is in agreement with
Damerau's [1] experience that these four classes account for over eighty percent of all spelling errors.
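The counts quoted for the uncorrected alphabetic keys, and the 83% figure derived from them, can be checked with a few lines of arithmetic. The variable names below are illustrative; all numbers are taken from the text.

```python
TOTAL_ERRONEOUS = 2527

# Uncorrected alphabetic keys by length, as reported in the text.
one_char, two_char, three_plus = 264, 249, 666
uncorrected_alphabetic = one_char + two_char + three_plus

# Multicharacter typographical errors: handled by the corrector vs. not handled.
corrected_multichar = 353
uncorrected_typos = 74

share_handled = corrected_multichar / (corrected_multichar + uncorrected_typos)

print(uncorrected_alphabetic)                                    # 1179
print(round(100 * uncorrected_typos / TOTAL_ERRONEOUS, 1))       # 2.9
print(round(100 * share_handled))                                # 83
```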

Non-typographical multicharacter alphabetic erroneous keys (23.4%): Intentional error, Good correction or
disambiguation rejected, Syntax or vocabulary error. In addition to the 74 multicharacter alphabetic keys that
we could diagnose as typographical errors, there were 80 keys (3.2% of all erroneous keys) that appeared to be
intentional errors and 10 cases (0.4%) in which we were reasonably confident that accurate corrections or
disambiguations were rejected by users. The intentional errors included messages to the authors of the
spelling corrector (e.g., “hithere”, and “doyoureallymeanyoucantfigureouthbok”), strings which appeared to
result from use of the keyboard as a pacifier (e.g., “kkklknlkn”), and a sequence of twenty-eight consecutive
misspellings of the command “Put” (“up”, “tup”, “tpu”, “spot”, etc.), presumably generated by a user who
was probing (i.e., playing with) the spelling corrector. This leaves 502 (19.9%) legitimate multicharacter
alphabetic erroneous keys, which we must presume were due to errors above the typographical level, i.e.,
syntax and vocabulary errors. Broadly speaking, in these cases the user either forgot the appropriate keyword,
used a keyword that would have been recognized in some other context, or induced a parsing error by
omitting a symbol and thereby leaving an operand keyword where an operator was expected (or vice-versa).
The following examples are typical:

    <-headers from Durham since 3-mar<CR>
    illegal message sequence at "SINCE" - junk at end
    FROM DURHAM SINCE 3-MAR
                ↑
    <-headers from Durham intersect since 3-mar<CR>
    Program prints headers of messages from Durham dated later than March 3.

    <-kjob/a<CR>
    ?No such command as kjob. Type ? for help.
    <-exit<CR>
    EXIT
    .kjob/a<CR>
    Logged off CMUA.

Non-alphabetic erroneous keys (26.4%): Control character, Number, Punctuation. The non-alphabetic
erroneous keys included 325 control characters (12.9% of all erroneous keys), 116 numbers (4.6%), and 225
punctuation marks (8.9%). Among the control characters, the most common by far was CTRL-S, which
occurred 181 times (7.2% of all erroneous keys). As we mentioned earlier, this character is used on our system
to suspend output to a terminal temporarily. Normally, the user types CTRL-Q to cause output to resume.
However, typing a second CTRL-S while output is suspended will cause output to resume, but the operating
system will pass the second CTRL-S to the program's input stream. If a user types CTRL-S, but output doesn't
stop immediately (because the load on the system is impairing response time), he may type a second CTRL-S,
thereby inadvertently sending a CTRL-S to RdMail. We believe that this phenomenon accounts for all, or
almost all, the observed occurrences of CTRL-S as an erroneous key. In some cases CTRL-S might have been
intended as SHIFT-S, but these cases alone can hardly account for the great frequency of CTRL-S compared to
other control characters. Given RdMail’s command syntax, we would have expected numbers and punctuation
marks to appear most frequently as erroneous keys in the middle of long commands. Surprisingly, 84%
of the numbers and 51% of the punctuation marks, as well as 92% of the control characters other than CTRL-S,
occurred as the first symbols of the commands in which they were detected as erroneous keys. We have no
solid explanation for this phenomenon.
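The CTRL-S behavior described above can be modeled as a two-state machine. The sketch below is a simplified model written for illustration, not the operating system's terminal driver; the function name is hypothetical, and the model ignores timing and echoing.

```python
CTRL_S = "\x13"  # suspend output
CTRL_Q = "\x11"  # resume output


def keys_seen_by_program(typed: str) -> str:
    """Return the characters that reach the program's input stream.

    Model: the first CTRL-S suspends output and is absorbed; CTRL-Q resumes
    output and is absorbed; a CTRL-S typed while output is already suspended
    resumes output and is passed through to the program.
    """
    suspended = False
    delivered = []
    for ch in typed:
        if ch == CTRL_S:
            if suspended:
                suspended = False
                delivered.append(ch)  # the stray CTRL-S becomes an erroneous key
            else:
                suspended = True
        elif ch == CTRL_Q:
            suspended = False
        else:
            delivered.append(ch)
    return "".join(delivered)


# Normal use: suspend, resume, then a command.
print(repr(keys_seen_by_program(CTRL_S + CTRL_Q + "ty")))  # 'ty'
# Impatient user: a second CTRL-S leaks into the command.
print(repr(keys_seen_by_program(CTRL_S + CTRL_S + "ty")))  # '\x13ty'
```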

4.5.  Evaluation 

Our mechanism handled 27% of the erroneous keys entered during the experiment. Examination of the
remaining 73% led us to wonder what other mechanisms might permit further corrections while retaining the
typescript-style interface. Most of the other errors seemed specific to the operating system (TOPS-10) or
application (RdMail). Although there does not seem to be a mechanism as general as the spelling corrector
for handling these errors, we believe that developing an “expert” level of friendliness requires paying
attention to this sort of detail.

Since the ambiguous key “D” was almost always disambiguated into “Delete,” adding “D” to the main
command table as a synonym for “Delete” would remove 3.7% of the errors. Ignoring the character CTRL-S,
or treating it as a space, could eliminate 7.2% of the errors. Since we believe most of these occur because of
attempts to suspend typeout, this seems reasonable. Ignoring all control characters could account for a further
5.7%, but further study is needed to determine why these errors occur.
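The first two remedies amount to a small preprocessing step applied to each key before table lookup. A minimal sketch follows, under stated assumptions: the function name and the synonym table are hypothetical, only CTRL-S is discarded (other control characters are left alone, since their cause is not yet understood), and this is not how RdMail itself was changed.

```python
CTRL_S = "\x13"

# Hypothetical synonym table: abbreviations that were almost always
# disambiguated the same way during the experiment.
SYNONYMS = {"d": "Delete"}


def preprocess_key(key: str) -> str:
    """Drop stray CTRL-S characters, then expand any known synonym."""
    cleaned = key.replace(CTRL_S, "")
    return SYNONYMS.get(cleaned.lower(), cleaned)


print(preprocess_key(CTRL_S + "ty"))  # ty
print(preprocess_key("D"))            # Delete
```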

Errors caused by typeahead might be reduced by not echoing characters until the application requests
input, as is done on TOPS-20. Doing so might actually increase error rates, however, since users would not
be able to see their typeahead. Our data indicate that typeahead errors are very infrequent.

A  portion  of  the  syntax  and  vocabulary  errors  (19.8%)  and  numeric  errors  (4.6%)  may  be  due  to  omitted 
keywords,  or  to  the  user  forgetting  the  context.  These  errors  may  be  amenable  to  the  techniques  described  by 
Hayes, Ball, and Reddy [4]. Some syntactic errors might be handled by the recovery techniques used in
compilers,  or  might  be  eliminated  by  modifications  to  the  grammar.  For  example,  after  the  experiment  we 
made  a  small  modification  to  the  grammar  for  RdMail  message  sequences  so  that  a  user  may  omit  the 
keyword  “intersect.” 

Finally, there are some errors that do not seem to admit any reasonable automatic recovery. For example,
if a user tries to classify a message as “ICs” (a user-defined class for messages regarding integrated circuits),
when the name of the class is actually “chips”, the best that can be done is to allow the user to choose among
the names of all of his classifications.

5. Conclusion

The spelling corrector offered a unique acceptable correction for 13% of the keyword errors detected
during the experiment. In a further 3% of the cases it found multiple possible corrections, one of which was
accepted by the user. Allowing the user to correct ambiguities manually fixed a further 11% of the keyword
errors.

The correction algorithm is very simple to implement and costs us about half a second per day for a heavily
used interactive system. The corrector was invoked about 50 times a day at an average cost of about 10
milliseconds. RdMail has since been modified to apply the corrector to ambiguities as well as to symbols that
are not in the symbol table at all. Projecting from our data, the invocation rate would increase to about 60
times per day. The data clearly support the premise that spelling correction is “straightforward” in user
interface applications. The most complex part of the engineering is deciding how the system should behave
given the results of the correction algorithm.
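The daily figures follow from the totals reported in Section 4. The sketch below reproduces them under one assumption: that the corrector was invoked once for each key not found in a symbol table, and would be invoked once more for each ambiguous key after the modification. The variable names are illustrative.

```python
DAYS = 41
UNMATCHED_KEYS = 2031   # keys not found in a symbol table
AMBIGUOUS_KEYS = 496    # keys that matched more than one symbol
AVERAGE_MS = 9.5        # average corrector time per invocation

invocations_per_day = UNMATCHED_KEYS / DAYS
cost_ms_per_day = invocations_per_day * AVERAGE_MS
projected_per_day = (UNMATCHED_KEYS + AMBIGUOUS_KEYS) / DAYS

print(round(invocations_per_day, 1))  # 49.5
print(round(cost_ms_per_day))         # 471
print(round(projected_per_day, 1))    # 61.6
```

The estimated daily cost is close to the measured 468 milliseconds per day, and the projected rate is consistent with "about 60 times per day".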

It  is  interesting  that,  in  response  to  repeated  requests  by  certain  users,  the  RdMail  maintainers  have 
provided  options  for  suppressing  all  of  the  extra  warnings  and  confirmations  normally  produced  when  some 
irreversible  action  is  about  to  occur.  Such  users  are  vulnerable  when  the  corrector  changes  a  typographical 
error  into  a  valid,  irreversible  command.  The  mistakes  made  by  experts  appear  intuitively  to  be  caused  by 
rapid  typing  and  extensive  use  of  abbreviations,  while  less  experienced  users  tend  to  use  full  command  names 
and  make  the  more  common  typographical  errors. 

We have installed the command module that uses the spelling corrector in the standard SAIL library at
C-MU. As a consequence, any program that uses the library module acquires the spelling correction facility
the next time that it is link-edited. The number of programs that now routinely provide spelling correction
without any action at all on the part of their author or maintainer is growing slowly.

We  conjecture  that  the  spelling  correction  facility  and  algorithm  described  in  this  paper  would  be  equally 
beneficial  in  both  operating  system  environments  (interactive  and  batch)  and  compiler  applications,  where 
computing  resources  might  be  conserved  by  continuing  computations  that  might  otherwise  be  aborted,  only 
to be repeated later. In particular, we are somewhat surprised that the work described by Morgan in 1970 [6]
has  not  found  wider  application  today.  We  foresee  no  significant  technical  difficulties  in  implementing  our 
algorithm  in  a  variety  of  languages.  Perhaps  our  results  can  convince  programmers  to  provide  this  simple, 
cheap, and effective facility in new and even existing user interfaces.

I. Example Implementation of the Spelling Corrector

The following Ada implementation illustrates the functions required for the spelling correction algorithm.
The code was compiled by the Intermetrics Ada Prototype Compiler and executed on a DECsystem-20.⁴ This
example uses the 1980 version of Ada, since a compiler for 1982 Ada was not available to us at the time of
publication.


-- Example implementation of the Spelling Corrector in DoD Ada.
-- This code is operational. However, to improve the clarity of this example,
-- we have omitted the detailed interactions with the user. We have excluded
-- the routines that interact with the user (User_Accepts, User_Selects) and
-- have commented out their invocation.

package Spelling_Corrector is
   Not_Correctable: exception;  -- Raised if no corrections found.
   type Symbol_Table is array(integer range <>) of string(1..32);

   -- The Correct_Spelling function delivers the index in the table of
   -- the corrected symbol or raises the Not_Correctable exception.
   function Correct_Spelling(ST: in Symbol_Table;
                             User_Word: in string;
                             Assume_Correct: in boolean) return integer;
end Spelling_Corrector;

with text_io;  -- Need an I/O package
use text_io;
package body Spelling_Corrector is

   -- The Same_Character function returns true iff the two characters are
   -- the same when case-differences are ignored.
   function Same_Character(A,B: in character) return boolean is
      Folded_A, Folded_B: integer;  -- Case folded character positions
      Case_Difference: constant integer := character'POS('a') - character'POS('A');
   begin
      Folded_A := character'POS(A);
      if A in 'A'..'Z' then
         Folded_A := Folded_A + Case_Difference;  -- Upper to lower case conversion
      end if;
      Folded_B := character'POS(B);
      if B in 'A'..'Z' then
         Folded_B := Folded_B + Case_Difference;  -- Upper to lower case conversion
      end if;
      return Folded_A = Folded_B;
   end Same_Character;

   -- The First_Difference function locates the first character
   -- position at which the two parameter strings differ (ignoring case
   -- distinctions). Zero is returned if either string is empty.
   function First_Difference(A,B: in string) return integer is
      Last_Index: integer;
   begin
      if A'LENGTH <= B'LENGTH then  -- Find shorter string
         Last_Index := A'LENGTH;
      else
         Last_Index := B'LENGTH;
      end if;
      if Last_Index = 0 then  -- One string is empty
         return 0;
      end if;
      for i in 1..Last_Index loop
         if not Same_Character(A(i),B(i)) then
            return i;
         end if;
      end loop;
      return Last_Index+1;
   end First_Difference;

   -- Function Match_Substring returns true iff the second string (B) is
   -- an initial substring of the first string (A). A is considered to
   -- begin at index First_A and B is considered to begin at index First_B.
   function Match_Substring(A: in string; First_A: in natural;
                            B: in string; First_B: in natural) return boolean is
   begin
      if First_B > B'LENGTH then
         return true;  -- B is empty sub-string
      elsif (First_A > A'LENGTH) or ((B'LAST-First_B) > (A'LAST-First_A)) then
         return false;  -- A is empty or B is too long.
      end if;
      for i in 0..(B'LAST-First_B) loop
         if not Same_Character(A(i+First_A),B(i+First_B)) then
            return false;
         end if;
      end loop;
      return true;
   end Match_Substring;

   -- Function Possible_Correction returns true iff one of the four tests
   -- applied to the user word yields the dictionary word.
   function Possible_Correction(Dictionary_Word, User_Word: in string) return boolean is
      Index: integer;
   begin
      -- Heuristic: Can't match if symbol is more than one character longer than dictionary word.
      if (User_Word'LENGTH - 1) > Dictionary_Word'LENGTH then
         return false;
      end if;

      -- Step 1: Find the index of the first different characters
      Index := First_Difference(Dictionary_Word,User_Word);

      -- Heuristic: Assume wrong letter if difference at end of word
      if (Index = User_Word'LENGTH) and (User_Word'LENGTH > 2) then
         return true;
      end if;

      -- Step 2: Check for transposed characters & tail match
      if ((Dictionary_Word'LAST > Index) and (User_Word'LAST > Index)) and then
         (Same_Character(Dictionary_Word(Index),User_Word(Index+1)) and
          Same_Character(Dictionary_Word(Index+1),User_Word(Index)) and
          Match_Substring(Dictionary_Word,(Index+2),User_Word,(Index+2))) then
         return true;  -- Transposition.
      end if;

      -- Step 3: Apply remaining tail substring matches
      if Match_Substring(Dictionary_Word,(Index+1),User_Word,Index) then
         return true;  -- Missing letter.
      end if;

      -- Policy: Don't try other tests on 2-character strings.
      if User_Word'LENGTH = 2 then
         return false;
      end if;

      if Match_Substring(Dictionary_Word,Index,User_Word,(Index+1)) then
         return true;  -- Extra letter.
      end if;

      if Match_Substring(Dictionary_Word,(Index+1),User_Word,(Index+1)) then
         return true;  -- Wrong letter.
      end if;

      return false;
   end Possible_Correction;




   function Correct_Spelling(ST: in Symbol_Table;
                             User_Word: in string;
                             Assume_Correct: in boolean) return integer is
      Match_Count, Last_Match: integer := 0;
      Match_Flag: array (ST'range) of boolean;
      Test_Word: string(1..User_Word'LENGTH) := User_Word;
   begin
      for i in ST'range loop
         Match_Flag(i) := Possible_Correction(ST(i),Test_Word);
         if Match_Flag(i) then
            Match_Count := Match_Count + 1;
            Last_Match := i;
         end if;
      end loop;

      if Match_Count = 1 then
         if Assume_Correct then
            Put_Line("% I assume you mean '" & ST(Last_Match) & "' instead of '" & Test_Word & "'");
            return Last_Match;
         -- elsif User_Accepts(ST(Last_Match),Test_Word) then
         --    -- Ask "Do you mean 'x' instead of 'y'?" question.
         --    return Last_Match;
         end if;
      elsif Match_Count > 1 then
         Put_Line("% '" & Test_Word & "' could be any one of:");
         -- return User_Selects(ST,Test_Word,Match_Flag);
      end if;

      raise Not_Correctable;
   end Correct_Spelling;
end Spelling_Corrector;
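For readers without access to an Ada compiler, the matching logic of the appendix can be sketched in Python. This is a port written for illustration under stated assumptions, not the code used in the experiment: indices are 0-based, the function names are adapted to Python conventions, the command table is hypothetical, and the user interaction is replaced by simply returning the list of candidate corrections.

```python
from typing import List


def first_difference(a: str, b: str) -> int:
    """0-based index of the first case-insensitive mismatch,
    or the length of the shorter string if there is none."""
    shorter = min(len(a), len(b))
    for i in range(shorter):
        if a[i].lower() != b[i].lower():
            return i
    return shorter


def tail_matches(dict_word: str, i: int, user_word: str, j: int) -> bool:
    """True iff user_word[j:] is a case-insensitive prefix of dict_word[i:]."""
    return dict_word[i:].lower().startswith(user_word[j:].lower())


def possible_correction(dict_word: str, user_word: str) -> bool:
    """Apply the tests in the same order as Possible_Correction in the appendix."""
    if not dict_word or not user_word:
        return False
    # Heuristic: the user word may be at most one character longer.
    if len(user_word) - 1 > len(dict_word):
        return False
    i = first_difference(dict_word, user_word)
    # Heuristic: assume a wrong letter if the difference is at the end of the word.
    if i == len(user_word) - 1 and len(user_word) > 2:
        return True
    # Transposed characters followed by a matching tail.
    if (i + 1 < len(dict_word) and i + 1 < len(user_word)
            and dict_word[i].lower() == user_word[i + 1].lower()
            and dict_word[i + 1].lower() == user_word[i].lower()
            and tail_matches(dict_word, i + 2, user_word, i + 2)):
        return True
    if tail_matches(dict_word, i + 1, user_word, i):      # missing letter
        return True
    if len(user_word) == 2:                               # policy for 2-character words
        return False
    if tail_matches(dict_word, i, user_word, i + 1):      # extra letter
        return True
    return tail_matches(dict_word, i + 1, user_word, i + 1)  # wrong letter


def corrections(table: List[str], user_word: str) -> List[str]:
    """All table entries that are possible corrections of user_word."""
    return [word for word in table if possible_correction(word, user_word)]


COMMANDS = ["Answer", "Delete", "Exit", "Headers", "Type"]  # hypothetical table
print(corrections(COMMANDS, "hedaers"))  # ['Headers']
print(corrections(COMMANDS, "tpye"))     # ['Type']
```

As in the Ada version, the user word only has to match an initial substring of the dictionary word once the error is repaired, so abbreviated keywords are also corrected.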


